Abstract

We study the use of viral marketing strategies on social networks to maximize revenue from the sale of 
a single product. We propose a model in which the decision of a buyer to buy the product is influenced 
by friends that own the product and the price at which the product is offered. The influence model 
we analyze is quite general, naturally extending both the Linear Threshold model and the Independent 
Cascade model, while also incorporating price information. We consider sales proceeding in a cascading 
manner through the network, i.e. a buyer is offered the product via recommendations from its neighbors 
who own the product. In this setting, the seller influences events by offering a cashback to recommenders 
and by setting prices (via coupons or discounts) for each buyer in the social network. 

Finding a seller strategy which maximizes the expected revenue in this setting turns out to be NP-hard.
However, we propose a seller strategy that generates revenue guaranteed to be within a constant
factor of the optimal strategy in a wide variety of models. The strategy is based on an influence-and-exploit
idea, and it consists of finding the right trade-off at each time step between: generating revenue
from the current user versus offering the product for free and using the influence generated from this 
sale later in the process. We also show how local search can be used to improve the performance of this 
technique in practice. 

1 Introduction 

Social networks such as Facebook, Orkut and MySpace are free to join, and they attract vast numbers of 
users. Maintaining these websites for such a large group of users requires substantial investment from the 
host companies. To help recoup these investments, these companies often turn to monetizing the information 
that their users provide for free on these websites. This information includes both detailed profiles of users 
and also the network of social connections between the users. Not surprisingly, there is a widespread belief 
that this information could be a gold mine for targeted advertising and other online businesses. Nonetheless, 
much of this potential still remains untapped today. Facebook, for example, was valued at $15 billion by
Microsoft in 2007 [13], but its estimated revenue in 2008 was only $300 million [17]. With so many users
and so much data, higher profits seem like they should be possible. Facebook's Beacon advertising system 
does attempt to provide targeted advertisements but it has only obtained limited success due to privacy 
concerns [16] . 

This raises the question of how companies can better monetize the already public data on social networks 
without requiring extra information and thereby compromising privacy. In particular, most large-scale 
monetization technologies currently used on social networks are modeled on the sponsored search paradigm 
of contextual advertising and do not effectively leverage the networked nature of the data. 

Recently, however, people have begun to consider a different monetization approach that is based on 
selling products through the spread of influence. Often, users can be convinced to purchase a product if 
many of their friends are already using it, even if these same users would be hard to convince through direct
advertising. This is often a result of personal recommendations - a friend's opinion can carry far more weight 
than an impersonal advertisement. In some cases, however, adoption among friends is important for even 
more practical reasons. For example, instant messenger users and cell phone users will want a product that 
allows them to talk easily and cheaply with their friends. Usually, this encourages them to adopt the same 
instant messenger program and the same cell phone carrier that their friends have. We refer the reader 
to previous work and the references therein for further explanations behind the motivation of the influence 
model [4].

In fact, many sellers already do try to utilize influence-and-exploit strategies that are based on these
tendencies. In the advertising world, this has recently led to the adoption of viral marketing, where a
seller attempts to artificially create word-of-mouth advertising among potential customers [8, 9, 14]. A more
powerful but riskier technique has been in use much longer: the seller gives out free samples or coupons to a 
limited set of people, hoping to convince these people to try out the product and then recommend it to their 
friends. Without any extra data, however, this forces sellers to make some very difficult decisions. Who do 
they give the free samples to? How many free samples do they need to give out? What incentives can they 
afford to give to recommenders without jeopardizing the overall profit too much? 

In this paper, we are interested in finding systematic answers to these questions. In general terms, we 
can model the spread of a product as a process on a social network. Each node represents a single person, 
and each edge represents a friendship. Initially, one or more nodes is "active", meaning that person already
has the product. This could either be a large set of nodes representing an established customer base, or it 
could be just one node - the seller - whose neighbors consist of people who independently trust the seller, 
or who are otherwise likely to be interested in early adoption. 

At this point, the seller can encourage the spread of influences in two ways. First of all, it can offer 
cashback rewards to individuals who recommend the product to their friends. This is often seen in practice 
with "referral bonuses" - each buyer can optionally name the person who referred them, and this person then 
receives a cash reward. This gives existing buyers an incentive to recommend the product to their friends. 
Secondly, a seller can offer discounts to specific people in order to encourage them to buy the product, above 
and beyond any recommendations they receive. It is important to choose a good discount from the beginning 
here. If the price is not acceptable when a prospective buyer first receives recommendations, they might not 
bother to reconsider even if the price is lowered later. 

After receiving discount offers and some set of recommendations, it is up to the prospective buyers to 
decide whether to actually go through with a purchase. In general, they will do so with some probability 
that is influenced by the discount and by the set of recommendations they have received. The form of this 
probability is a parameter of the model and it is determined by external factors, for instance, the quality 
of the product and various exogenous market conditions. While it is impossible for a seller to calculate the 
form of this probability exactly, they can estimate it from empirical observations, and use that estimate to
inform their policies. One could interpret the probabilities according to a number of different models that 
have been proposed in the literature (for instance, the Independent Cascade and Linear Threshold models), 
and hence it is desirable for the seller to be able to come up with a strategy that is applicable to a wide 
variety of models. 

Now let us suppose that a seller has access to data from a social network such as Facebook, Orkut, or 
MySpace. Using this, the seller can estimate what the real, true, underlying friendship structure is, and 
while this estimate will not be perfect, it is getting better over time, and any information is better than 
none. With this information in hand, a seller can model the spread of influence quite accurately, and the 
formerly inscrutable problems of who to offer discounts to, and at what price, become algorithmic questions 
that one can legitimately hope to solve. For example, if a seller knows the structure of the network, she can 
locate individuals that are particularly well connected and do everything possible to ensure they adopt the 
product and exert their considerable influence. 

In this paper, we are interested in the algorithmic side of this question: Given the network structure and 
a model of the purchase probabilities, how should the seller decide to offer discounts and cashback rewards? 

1.1 Our contributions



We investigate seller strategies that address the above questions in the context of expected revenue maximization.
We will focus much of our attention on non-adaptive strategies for the seller: the seller chooses
and commits to a discount coupon and cashback offer for each potential buyer before the cascade starts. If a 
recommendation is given to this node at any time, the price offered will be the one that the seller committed 
to initially, irrespective of the current state of the cascade. 

A wider class of strategies that one could consider are adaptive strategies, which do not have this restriction.
For example, in an adaptive strategy, the seller could choose to observe the outcome of the (random)
cascading process up until the last minute before making very well informed pricing decisions for each node.
One might imagine that this additional flexibility could allow for potentially large improvements over
non-adaptive strategies. Unfortunately, there is a price to be paid, in that good adaptive strategies are likely to
be very complicated, and thus difficult and expensive to implement. The ratio of the revenue generated from
the optimal adaptive strategy to the revenue generated from the optimal non-adaptive strategy is termed
the "adaptivity gap".

Our main theoretical contribution is a very efficient non-adaptive strategy whose expected revenue is 
within a constant factor of the optimal revenue from an adaptive strategy. This guarantee holds for a wide 
variety of probability functions, including natural extensions of both the Linear Threshold and Independent 
Cascade models.¹ Note that a surprising consequence of this result is that the adaptivity gap is constant,
so one can make the case that not much is lost by restricting our attention to non-adaptive policies. We
also show that the problem of finding an optimal non-adaptive strategy is NP-hard, which means an efficient
approximation algorithm is the best theoretical result that one could hope for. 

Intuitively, the seller strategy we propose is based on an influence-and-exploit idea, and it consists of
categorizing each potential buyer as either an influencer or a revenue source. The influencers are offered the 
product for free and the revenue sources are offered the product at a pre-determined price, chosen based 
on the exact probability model. Briefly, the categorization is done by finding a spanning tree of the social 
network with as many leaves as possible, and then marking the leaves as revenue sources and the internal 
nodes as influencers. We can find such a tree in near-linear time [7, 10]. Cashback amounts are chosen to be
a fixed fraction of the total revenue expected from this process. The full details are presented in Section 3.

In practice, we propose using this approach to find a strategy that has good global properties, and then 
using local search to improve it further. This kind of combination has been effective in the past, for example 
on the k-means problem [1]. Indeed, experiments (see Section 4) show that combining local search with the
above influence-and-exploit strategy is more effective than using either approach on its own.



1.2 Related work 

The problem of social contagion or spread of influence was first formulated by the sociological community, 
and introduced to the computer science community by Domingos and Richardson [2]. An influential paper 
by Kempe, Kleinberg and Tardos [5] solved the target set selection problem posed by [2] and sparked interest 
in this area from a theoretical perspective (see [6]). This work has mostly been limited to the influence
maximization paradigm, where influence has been taken to be a proxy for the revenue generated through 
a sale. Although similar to our work in spirit, there is no notion of price in this model, and therefore, our 
central problem of setting prices to encourage influence spread requires a more complicated model. 

A recent work by Hartline, Mirrokni and Sundararajan [4] is similar in flavor to our work, and also
considers extending social contagion ideas with pricing information, but the model they examine differs from
our model in several aspects. The main difference is that they assume that the seller is allowed to approach
arbitrary nodes in the network at any time and offer their product at a price chosen by the seller, while in 
our model the cascade of recommendations determines the timing of an offer and this cannot be directly 
manipulated. In essence, the model proposed in [4] is akin to advertising the product to arbitrary nodes, 
bypassing the network structure to encourage a desired set of early adopters. Our model restricts such direct
advertising as it is likely to be much less effective than a direct recommendation from a friend, especially when
the recommender has an incentive to convince the potential buyer to purchase the product (for instance,
the recommender might personalize the recommendation, increasing its effectiveness). Despite the different
models, the algorithms proposed by us and [4] are similar in spirit and are based on an influence-and-exploit
strategy.

¹ More precisely, the strategy achieves a constant-factor approximation for any fixed model, independent of the social network.
If one changes the model, the approximation factor does vary, as made precise in Section 3.

This work has also been inspired by a direction mentioned by Kleinberg [6], and is our interpretation of
the informal problem posed there. Finally, we point out that the idea of cashbacks has been implemented
in practice, and new retailers are also embracing the idea [8, 9, 14]. We note that some of the systems being
implemented by retailers are quite close to the model that we propose, and hence this problem is relevant in
practice.



2 The Formal Model 

Let us start by formalizing the setting stated above. We represent the social network as an undirected graph
$G(V,E)$, and denote the initial set of adopters by $S^0 \subseteq V$. We also denote the active set at time $t$ by
$S^{t-1}$ (we call a node active if it has purchased the product and inactive otherwise). Given this setting, the
recommendations cascade through the network as follows: at each time step $t \ge 1$, the nodes that became
active at time $t - 1$ (i.e. $S^0$ for $t = 1$, and $u \in S^{t-1} \setminus S^{t-2}$ for $t \ge 2$) send recommendations to their currently
inactive friends in the network: $N^{t-1} = \{v \in V \setminus S^{t-1} \mid (u, v) \in E, u \in S^{t-1} \setminus S^{t-2}\}$. Each such node $v \in N^{t-1}$
is also given a price $c_{v,t} \in \mathbb{R}$ at which it can purchase the product. This price is chosen by the seller to either
be full price or some discounted fraction thereof.

The node v must then decide whether to purchase the product or not (we discuss this aspect in the 
next section). If $v$ does accept the offer, a fixed cashback $r > 0$ is given to a recommender $u \in S^{t-1}$ (note
that we are fixing the cashback to be a positive constant for all the nodes as the nodes are assumed to
be non-strategic and any positive cashback provides incentive for them to provide recommendations). If
there are multiple recommenders, the buyer must choose only one of them to receive the cashback; this is a
system that is quite standard in practice. In this way, offers are made to all nodes $v \in N^{t-1}$ through the
recommendations at time t and these nodes make a decision at the end of this time period. The set of active 
nodes is then updated and the same process is repeated until the process quiesces, which it must do in finite 
time since any step with no purchases ends the process. 
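
The cascade described above can be sketched as a short simulation. This is only an illustration under stated assumptions, not the authors' implementation: the name `simulate_cascade`, the dictionary-based graph, and the `buy_prob` callback (a stand-in for the buyer model discussed in the next section) are hypothetical, prices are committed in advance (non-adaptive), and cashbacks are ignored (r = 0).

```python
import random


def simulate_cascade(adj, seeds, price, buy_prob, rng=None):
    """Run one cascade and return (revenue, set of active nodes).

    adj      -- dict: node -> set of neighbours (undirected graph)
    seeds    -- initial adopters S^0
    price    -- dict: non-seed node -> committed price in [0, 1]
    buy_prob -- hypothetical callback
                (price, new_recs, total_recs, degree) -> purchase probability
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = set(seeds)   # nodes that became active in the previous step
    total_recs = {}         # node -> number of recommendations received so far
    revenue = 0.0
    while frontier:
        # Inactive neighbours of newly active nodes receive recommendations.
        new_recs = {}
        for u in frontier:
            for v in adj[u]:
                if v not in active:
                    new_recs[v] = new_recs.get(v, 0) + 1
        newly_active = set()
        for v, k in new_recs.items():
            total_recs[v] = total_recs.get(v, 0) + k
            p = buy_prob(price[v], k, total_recs[v], len(adj[v]))
            if rng.random() < p:
                newly_active.add(v)
                revenue += price[v]
        active |= newly_active
        frontier = newly_active   # an empty frontier ends the process
    return revenue, active
```

Averaging the returned revenue over many runs gives a Monte Carlo estimate of the expected revenue of a fixed price assignment.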

In the model described above, the only degree of freedom that the seller has is in choosing the prices and 
the cashback amounts. It wants to do this in a way that maximizes its own expected revenue (the expectation 
is over randomness in the buyer strategies). Since the seller may not have any control over the seed set, we 
are looking for a strategy that can maximize the expected revenue starting from any seed set on any graph. 
In most online scenarios, producing extra copies of the product has negligible cost, so maximizing expected 
revenue will also maximize expected profit. 

Now we can formally state the problem of finding a revenue maximizing strategy as follows: 

Problem 1. Given a connected undirected graph $G(V,E)$, a seed set $S^0$, a fixed cashback amount $r$, and
a model M for determining when nodes will purchase a product, find a strategy that maximizes the expected 
revenue from the cascading process described above. 

We are particularly interested in non-adaptive policies, which correspond to choosing a price for each 
node in advance, making the price independent of the time of the recommendation and the state of the 
cascade at the time of the offer. Our goal will be threefold: (1) to show that this problem is NP-hard even 
for simple models M, (2) to construct a constant-factor approximation algorithm for a wide variety of models, 
and (3) to show that restricting to non-adaptive policies results in at most a constant factor loss of profit.

To simplify the exposition, we will assume the cashback $r = 0$ for now. At the end of Section 4, we will
show how the results can be generalized to work for positive r, which should be sufficient incentive for buyers 
to pass on recommendations. 

2.1 Buyer decisions

In this section, we discuss how to model the probability that a node will actually buy the product given a set 
of recommendations and a price. We use a very general model in this work that naturally extends the most 
popular traditional models proposed in the influence maximization literature, including both Independent 
Cascade and Linear Threshold. 

Consider an abstract model M for determining the probability that a node will buy a product given a 
price and what recommendations it has received. We allow M to take on virtually any form, imposing only 
the following conditions: 

1. The seller has full information about M. This is a standard assumption, and it can be approximated in 
practice by running experiments and observing people's behavior. 

2. A node will never pay more than full price for the product (we assume this full price is 1 without loss of 
generality). Without an assumption like this, the seller could potentially achieve unbounded revenue 
on a single network, which makes the problem degenerate. 

3. A node will always accept the product and recommend it to friends if it receives a recommendation 
with price 0 (i.e. if a friend offers the product for free). Since nodes are given positive cash rewards
for making recommendations, this condition is true for any rational buyer. 

4. If the social network is a single line graph with $S^0$ being the two endpoints, the maximum expected
revenue is at most a constant L. Intuitively, this states that each prospective buyer on a social network 
should have some chance of rejecting the product (unless it's given to them for free), and therefore the 
maximum revenue on a line is bounded by a geometric series, and is therefore constant. 

5. There exist constants $f$, $c$, $q$ so that if more than fraction $f$ of a given node's neighbors recommend
the product to the node at cost c, the node will purchase the product with probability q. This rules 
out extreme inertia, for example the case where no buyer will consider purchasing a product unless 
almost all of its neighbors have already done so. 

The fourth and fifth conditions here are used to parametrize how complicated the model is, and our final 
approximation bound will be in terms of this model "complexity", which is defined to be $\frac{L}{(1-f)cq}$. While it
may not be obvious that all these conditions are met in general, we will show that they are for both the 
Independent Cascade and Linear Threshold models, and indeed, the arguments there extend naturally to 
many other cases as well. 

In the traditional Independent Cascade model, there is a fixed probability p that a node will purchase a 
product each time it is recommended to them. These decisions are made independently for each recommen- 
dation, but each node will buy the product at most once. 

To generalize this to multiple prices, it is natural to make $p$ a function $[0, 1] \to [0, 1]$ where $p(x)$ represents
the probability that a node will buy the product at price $x$. For technical reasons, however, it is convenient
to work with the inverse of $p$, which we call $C$.² Our general conditions on the model reduce to setting
$C(0) = 1$ and $C(1) = 0$ in this case. To ensure bounded complexity, we also impose a minor smoothness
condition.

Definition 1. Fix a cost function $C : [0, 1] \to [0, 1]$ with $C(0) = 1$, $C(1) = 0$ and with $C$ differentiable at 0
and 1. We define the Independent Cascade Model $ICM_C$ as follows:

Every time a node receives a recommendation at price $C(x)$, it buys the product with probability $x$ and does
nothing otherwise. If a node receives multiple recommendations, it performs this check independently for 
each recommendation but it never purchases the product more than once. 
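
As a small worked illustration of Definition 1, the overall purchase probability after several independent recommendations can be computed directly. The function names and the linear cost function below are assumptions chosen for the example; any cost function with C(0) = 1 and C(1) = 0 that is differentiable at 0 and 1 would fit the definition equally well.

```python
def linear_cost(x):
    """Hypothetical cost function C(x) = 1 - x, so C(0) = 1 and C(1) = 0."""
    return 1.0 - x


def icm_buy_probability(x, num_recommendations):
    """Probability that a node buys after `num_recommendations` independent
    checks, each offered at price C(x) and each succeeding with probability x.
    The node buys at most once, so this is one minus the chance that every
    check fails."""
    return 1.0 - (1.0 - x) ** num_recommendations


# Example: offering at price C(0.5) to a node with two recommending friends.
offer_price = linear_cost(0.5)
purchase_probability = icm_buy_probability(0.5, 2)
```

The sketch makes explicit that, for a fixed price, additional recommendations only increase the chance of a sale.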

Lemma 1. Fix a cost function $C$. Then:

1. $ICM_C$ has bounded (model) complexity.

2. If $C$ has maximum slope $m$ (i.e. $|C(x) - C(y)| \le m|x - y|$ for all $x, y$), then $ICM_C$ has $O(m^2)$
complexity.

3. If $C$ is a step function with $n$ regularly spaced steps (i.e. $C(x) = C(y)$ if $\lfloor nx \rfloor = \lfloor ny \rfloor$), then $ICM_C$ has
$O(n^2)$ complexity.

² It is sometimes useful to consider functions $p(\cdot)$ that are not one-to-one. These functions have no formal inverse, but in
this case, $C$ can still be formally defined as $C(x) = \max \{ y \mid p(y) \ge x \}$.

Proof. We show that the complexity of $ICM_C$ can be bounded in terms of the maximum slope of $C$ near 0 and
1. Recall that if $C$ is differentiable at 0, then, by definition, there exists $\epsilon > 0$ so that $\left|\frac{C(x) - C(0)}{x}\right| \le |C'(0)| + 1$
for $0 < x < \epsilon$. A similar argument can be made for $x = 1$, and thus we can say formally that there exist $m$ and
$\epsilon > 0$ such that:

$C(x) \ge 1 - mx$ for $x \le \epsilon$, and
$C(x) \le m(1 - x)$ for $x \ge 1 - \epsilon$.

In this case, we will show that $ICM_C$ has complexity at most $8\max(\frac{1}{\epsilon}, m)^2$, proving part 1. Note that parts
2 and 3 of the lemma will also follow immediately.

We begin by analyzing $L_n$, the maximum expected revenue that can be achieved on a path of length $n$
if one of the endpoints is a seed. Note that $L \le 2\max_n L_n$ since selling a product on a line graph with two
seeds can be thought of as two independent sales, each with one seed, that are cut short if the sales ever
meet. Now we have:

$$L_n = \max_x x\,(C(x) + L_{n-1}).$$

This is because offering the product at cost $C(x)$ will lead to a purchase with probability $x$, and in that case,
we get $C(x)$ revenue immediately and $L_{n-1}$ expected revenue in the future. Since $L_n$ is obviously increasing
in $n$, this can be simplified further:

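$$L_n \le \max_x x\,(C(x) + L_n) \implies L \le 2L_n \le \max_{0<x<1} \frac{2x \cdot C(x)}{1 - x}.$$

For $x \ge 1 - \epsilon$, we have $\frac{2x \cdot C(x)}{1 - x} \le \frac{2x \cdot m(1 - x)}{1 - x} \le 2m$, and for $x \le 1 - \epsilon$, we have $\frac{2x \cdot C(x)}{1 - x} \le \frac{2}{\epsilon}$. Either way,
$L \le 2\max(\frac{1}{\epsilon}, m)$.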

It remains to choose $f$, $c$ and $q$ as per the first complexity condition. We use $f = 0$, $q = \min(\epsilon, \frac{1}{2m})$ and
$c = C(q) \ge \frac{1}{2}$. Indeed, if a node has more than 0 active neighbors, it will accept a recommendation at cost
$C(q)$ with probability $q$.

Thus $ICM_C$ has complexity at most $\frac{L}{(1-f)cq} \le 8\max(\frac{1}{\epsilon}, m)^2$, as required. □
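
The path recursion used in this proof, in which the revenue on a path of length n is the maximum over x of x times (C(x) plus the revenue on a path of length n - 1), is easy to check numerically. The sketch below iterates it on a grid for the hypothetical linear cost function C(x) = 1 - x; the function name, the grid resolution, and the choice of C are assumptions made for illustration. For this C one can take m = 1 and ε = 1, so the iterates should stay below max(1/ε, m) = 1.

```python
def path_revenue(cost, n, grid=1001):
    """Iterate L_k = max_x x * (C(x) + L_{k-1}) for k = 1..n,
    maximizing over a uniform grid of x values in [0, 1]."""
    xs = [i / (grid - 1) for i in range(grid)]
    revenue = 0.0          # L_0: no buyers left on the path
    history = []
    for _ in range(n):
        revenue = max(x * (cost(x) + revenue) for x in xs)
        history.append(revenue)
    return history


# Hypothetical linear cost function C(x) = 1 - x.
values = path_revenue(lambda x: 1.0 - x, 50)
```

The sequence in `values` is nondecreasing and bounded, which matches the claim that the revenue extractable from a single path is at most a constant.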

In the traditional Linear Threshold model, there are fixed influences $b_{v,w}$ on each directed edge $(v,w)$
in the network. Each node independently chooses a threshold $\theta$ uniformly at random from $[0, 1]$, and then
purchases the product if and when the total influence on it from nodes that have recommended the product
exceeds $\theta$.

To generalize this to multiple prices, it is natural to make $b_{v,w}$ a function $[0,1] \to [0,1]$ where $b_{v,w}(x)$
indicates the influence $v$ exerts on $w$ as a result of recommending the product at price $x$. To simplify the
exposition, we will focus on the case where a node is equally influenced by all its neighbors. (This is not
strictly necessary but removing this assumption requires rephrasing the definition of $f$ to be a weighted
fraction of a node's neighbors.) Finally, we assume for all $v, w$ that $b_{v,w}(0) = 1$ to satisfy the second general
condition for models.

Definition 2. Fix a max influence function $B : (0,1] \to [0,1]$, not uniformly 0. We define the Linear
Threshold Model $LTM_B$ as follows:

Every node independently chooses a threshold $\theta$ uniformly at random from $[0, 1]$. A node will buy the product
at price $x > 0$ only if $B(x) \ge \frac{\theta}{\alpha}$, where $\alpha$ denotes the fraction of the node's neighbors that have recommended
the product. A node will always accept a recommendation if the product is offered for free.
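
A minimal sketch of the resulting purchase probability follows. It assumes the purchase condition reads θ ≤ α · B(x), with θ uniform on [0, 1] and α the recommending fraction of neighbors, so that the probability of buying at a positive price is α · B(x). The function names and the example influence function are hypothetical.

```python
def ltm_buy_probability(price, num_recommending, degree, max_influence):
    """Probability that a node buys under the assumed reading of LTM_B:
    with a threshold drawn uniformly from [0, 1], the node buys at a positive
    price x exactly when threshold <= alpha * B(x)."""
    if price == 0:
        return 1.0                    # free offers are always accepted
    if degree == 0:
        return 0.0                    # no neighbours, so no recommendations
    alpha = num_recommending / degree
    return min(1.0, alpha * max_influence(price))


def example_influence(x):
    """Hypothetical max influence function B(x) = 1 - x on (0, 1]."""
    return 1.0 - x


# One of two neighbours recommends the product at price 0.5.
probability = ltm_buy_probability(0.5, 1, 2, example_influence)
```

On a line graph an interior node has two neighbors, of which only one recommends the product when the cascade arrives, so α = 1/2 and the purchase probability at any positive price is at most 1/2 under this reading.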

Lemma 2. Fix a max influence function $B$ and let $K = \max_x x \cdot B(x)$. Then $LTM_B$ has complexity $O(\frac{1}{K})$.

We omit the proof since it is similar to that of Lemma 1. In fact, it is simpler since, on a line graph, a
node either gets the product for free or it has probability at most $\frac{1}{2}$ of buying the product and passing on a
recommendation.



3 Approximating the Optimal Revenue 

In this section, we present our main theoretical contribution: a non-adaptive seller strategy that achieves
expected revenue within a constant factor of the revenue from the optimal adaptive strategy. We show the
problem of finding the exact optimal strategy is NP-hard (see Section 8.1 in the appendix), so this kind of
result is the best we can hope for. Note that our approximation guarantee is against the strongest possible
optimum, which is perhaps surprising: it is unclear a priori whether such a strategy should even exist.

The strategy we propose is based on computing a maximum-leaf spanning tree (MaxLeaf) of the underlying
social network graph, i.e., computing a spanning tree of the graph with the maximum number of leaf
nodes. The MaxLeaf problem is known to be NP-Hard, and it is in fact also MAX SNP-Complete, but
there are several constant-factor approximation algorithms known for the problem [3, 7, 10, 15]. In particular,
one of these is nearly linear-time [10], making it practical to apply on large online social network graphs.
The seller strategy we attain through this is an influence-and-exploit strategy that offers the product to all
of the interior nodes of the spanning tree for free, and charges a fixed price from the leaves. Note that this
strategy works for all the buyer decision models discussed above, including multi-price generalizations of
both Independent Cascade and Linear Threshold.

We consider the setting of Problem 1 where we are given an undirected social network graph $G(V, E)$, a
seed set $S^0 \subseteq V$ and a buyer decision model $M$. Throughout this section, we will let $L$, $f$, $c$ and $q$ denote the
quantities that parametrize the model complexity, as described in Section 2.1. To simplify the exposition,
we will assume for now that the seed set is a singleton node (i.e., $|S^0| = 1$). If this is not the case, the seed
nodes can be merged into a single node, and we can make much the same argument in that case. We will
ignore cashbacks for now, and return to address them at the end of the section.
The exact algorithm we will use is stated below: 

• Use the MaxLeaf algorithm [10] to compute an approximate max-leaf spanning tree $T$ for $G$ that is
rooted at $S^0$.

• Offer the product to each internal node of T for free. 

• For each leaf of $T$ (excluding $S^0$), independently flip a biased coin. With probability offer the
product to the node for free. With probability offer the product to the node at cost c. 

We henceforth refer to this strategy as StrategyMaxLeaf. 
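
The three steps above can be sketched as follows. Two substitutions are made and should be read as assumptions: a plain breadth-first spanning tree stands in for the MaxLeaf approximation algorithm of [10], which is not reproduced here, and the probability of a free offer to a leaf is left as the hypothetical parameter `p_free`. The function names are likewise illustrative.

```python
import random
from collections import deque


def bfs_spanning_tree(adj, root):
    """Stand-in for the MaxLeaf algorithm: returns a BFS spanning tree
    as a dict mapping each node to its list of children."""
    children = {root: []}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in children:
                children[v] = []
                children[u].append(v)
                queue.append(v)
    return children


def strategy_max_leaf(adj, root, cost_c, p_free, rng=None):
    """Assign a committed price to every non-seed node: internal tree nodes
    get the product for free, and each leaf gets it for free with probability
    p_free (hypothetical parameter) and at price cost_c otherwise."""
    rng = rng or random.Random()
    children = bfs_spanning_tree(adj, root)
    price = {}
    for node, kids in children.items():
        if node == root:
            continue                  # the seed already owns the product
        if kids:
            price[node] = 0.0         # internal node: influencer
        elif rng.random() < p_free:
            price[node] = 0.0         # leaf that receives a free offer
        else:
            price[node] = cost_c      # leaf: revenue source
    return price
```

Because the prices are fixed before the cascade starts, the output is a non-adaptive strategy in the sense used throughout the paper.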

Our analysis will revolve around what we term as "good" vertices, defined formally as follows: 

Definition 3. Given a graph G(V,E), we define the good vertices to be the vertices with degree at least 3 
and their neighbors. 
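
Definition 3 translates directly into code. The sketch below assumes the graph is given as a dictionary mapping each vertex to the set of its neighbors; the function name is chosen for illustration.

```python
def good_vertices(adj):
    """Return the set of good vertices: every vertex of degree at least 3,
    together with all neighbours of such vertices."""
    high_degree = {v for v, nbrs in adj.items() if len(nbrs) >= 3}
    good = set(high_degree)
    for v in high_degree:
        good |= set(adj[v])
    return good


# Example: vertex 0 has degree 3, and a path 3 - 4 - 5 hangs off it.
example = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3, 5}, 5: {4}}
good = good_vertices(example)
```

In the example, vertices 4 and 5 are not good: they lie on a path and are not adjacent to any vertex of degree at least 3.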

On the one hand, we show that if $G$ has $g$ good vertices, then the MaxLeaf algorithm will find a spanning
tree with $\Omega(g)$ leaves. We then show that each leaf of this tree leads to $\Omega(1)$ revenue, implying StrategyMaxLeaf
gives $\Omega(g)$ revenue overall. Conversely, we can decompose $G$ into at most $g$ line-graphs joining
high-degree vertices, and the total revenue from these is bounded by $gL = O(g)$ for all policies, which gives
the constant-factor approximation we need.

We begin by bounding the number of leaves in a max-leaf spanning tree. For dense graphs, we can rely
on the following fact [7, 10]:

Fact 1. The max-leaf spanning tree of a graph with minimum degree at least 3 has at least $n/4 + 2$ leaves [7, 10].



In general graphs, we cannot apply this result directly. However, we can make any graph have minimum
degree 3 by replacing degree-1 vertices with small, complete graphs and by contracting along edges to remove
degree-2 vertices. We can then apply Fact 1 to analyze this auxiliary graph, which leads to the following
result:

Lemma 3. Suppose a connected graph $G$ has $n_3$ vertices with degree at least 3. Then $G$ has a spanning tree
with at least $\frac{n_3}{8} + 1$ leaves.

Proof. Let $n_1$ and $n_2$ denote the number of vertices of degree 1 and 2 respectively, and let $M$ denote the
number of leaves in a max-leaf spanning tree of $G$. If $n_1 = n_2 = 0$, the result follows from Fact 1.

Now, suppose $n_2 = 0$ but $n_1 > 0$. Clearly, every spanning tree has at least $n_1$ leaves, so the result is
obvious if $n_1 \ge \frac{n_3}{8} + 1$. Otherwise, we replace each degree-1 vertex with a copy of $K_4$ (the complete graph on
4 vertices), one of whose vertices connects back to the rest of the graph. Let $G'$ denote the resulting graph.
Then $G'$ has $4n_1 + n_3$ vertices, and they are all at least degree 3, so $G'$ has a spanning tree $T'$ with at least
$n_1 + \frac{n_3}{4} + 2$ leaves.

We can transform this into a spanning tree $T$ on $G$ by contracting each copy of $K_4$ down to a single point.
Each contraction could transform up to 3 leaves into a single leaf, but it will not affect other leaves. Since
there are exactly $n_1$ contractions that need to be done altogether, $T$ has at least $n_1 + \frac{n_3}{4} + 2 - 2n_1 \ge \frac{n_3}{8} + 1$
leaves, as required. Here the last inequality uses $n_1 < \frac{n_3}{8} + 1$.

We now prove the result holds in general by induction on $n_2$. We have already shown the base case
($n_2 = 0$). For the inductive step, we will define an auxiliary graph $G'$ with $n_2'$, $n_3'$ and $M'$ defined as for $G$.
We will then show $n_2' < n_2$, $n_3' \ge n_3$, and for every spanning tree $T'$ on $G'$, there is a spanning tree $T$ on
$G$ with at least as many leaves. This implies $M' \le M$, and using the inductive hypothesis, it follows that
$M \ge M' \ge \frac{n_3'}{8} + 1 \ge \frac{n_3}{8} + 1$, which will complete the proof.

Towards that end, suppose $v$ is a degree-2 vertex in $G$, and let its neighbors be $u$ and $w$. If $u$ and $w$ are
not adjacent, we let $G'$ be the graph attained by contracting along the edge $(u,v)$. Then $n_2' = n_2 - 1$ and
$n_3' = n_3$. Any spanning tree $T'$ on $G'$ can be extended back to a spanning tree $T$ on $G$ by uncontracting the
edge $(u, v)$ and adding it to $T$. This does not decrease the number of leaves in the tree, so we are done.

Next, suppose instead that $u$ and $w$ are adjacent. We cannot contract $(u,v)$ here since it will create
a duplicate edge in $G'$. However, a different construction can be used. If the entire graph is just these 3
vertices, the lemma is trivial. Otherwise, let $G'$ be the graph attained by adding a degree-1 vertex $x$ adjacent
to $v$. Then $n_2' = n_2 - 1$ and $n_3' = n_3 + 1$. Now consider a spanning tree $T'$ of $G'$. We can transform this into
a spanning tree $T$ on $G$ by removing the edge $(v, x)$ that must be in $T'$. This removes the leaf $x$ but if $v$ has
degree 2 in $T'$, it makes $v$ a leaf. In this case, $T$ and $T'$ have the same number of leaves, so we are done.

Otherwise, (u,v) and (v,w) are also in T', and since G was assumed to have more than 3 vertices, u
and w cannot both be leaves in T'. Assume without loss of generality that u is not a leaf. We then further
modify T by replacing (v,w) with (u,w). Now, v is a leaf in T and the only other vertex whose degree has changed
is u, which is not a leaf in either T or T'. Therefore, T and T' again have the same number of leaves, and
we are once again done. 

The result now follows from induction, as discussed above. □ 

We must further extend this to be in terms of the number of good vertices g, rather than being in terms
of $n_3$:

Lemma 4. Given an undirected graph G with g good vertices, the MaxLeaf algorithm TTdj will construct a 
spanning tree with at least $\max(g/50 + 0.5, 2)$ leaves.

Proof. If g = 0, the result is trivial. Otherwise, let $n_3$ denote the number of vertices in G with degree at
least 3, and let M denote the number of leaves in a max-leaf spanning tree of G. By Lemma 3, we know
$M \ge n_3/8 + 1$.

Now consider constructing a spanning tree as follows: 
1. Let A denote the set of vertices in G with degree at least 3.

2. Set T to be a minimal subtree of G that connects all vertices in A. 

3. Add all remaining vertices in G to T one at a time. If a vertex v could be connected to T in multiple 
ways, connect it to a vertex in A if possible. 

To analyze this, note that G − A can be decomposed into a collection of "primitive" paths. Given a primitive
path P, let $g_P$ denote the number of good vertices on P and let $l_P$ denote the number of leaves T has on P.

In Step 2 of the algorithm above, exactly $n_3 - 1$ of these paths are added to T. For each such path P, we
have $g_P \le 2$ and $l_P = 0$. On the remaining paths, we have $g_P = l_P$. Therefore, the total number of leaves
on T is at least

$$\sum_P l_P = (g - n_3) + \sum_P (l_P - g_P) \ge (g - n_3) - 2(n_3 - 1).$$

Thus,

$$M \ge \max(n_3/8 + 1,\; g - 3n_3 + 1).$$

The result now follows from the fact that the MaxLeaf algorithm gives a 2- approximation for the max-leaf 
spanning tree, and that every non-degenerate tree has at least two leaves. □ 
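
The last step can be spelled out as follows. This is a sketch that assumes the two bounds $M \ge n_3/8 + 1$ (Lemma 3) and $M \ge g - 3n_3 + 1$ (the construction above); the weights 24/25 and 1/25 are chosen here only because they cancel $n_3$. A maximum is at least any weighted average of its arguments, so:

```latex
\begin{align*}
M &\ge \tfrac{24}{25}\left(\tfrac{n_3}{8} + 1\right) + \tfrac{1}{25}\left(g - 3n_3 + 1\right) \\
  &= \tfrac{3n_3}{25} + \tfrac{g}{25} - \tfrac{3n_3}{25} + 1 \\
  &= \tfrac{g}{25} + 1 .
\end{align*}
```

A 2-approximation then returns a tree with at least $(g/25 + 1)/2 = g/50 + 0.5$ leaves, which matches the bound in the lemma.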

We can now use this to prove a guarantee on the performance of StrategyMaxLeaf in terms of the 
number of good vertices on an arbitrary graph: 

Lemma 5. Given a social network G with g good vertices, StrategyMaxLeaf guarantees an expected
revenue of $\Omega((1 - f)cq \cdot g)$.

Proof. Let T denote the spanning tree found by the MaxLeaf algorithm. Let U denote the set of interior
nodes of T, and let V denote the leaves of T (excluding $S^0$). Since we assumed $|S^0| = 1$, Lemma 4 guarantees
$|V| \ge \max(g/50 - 0.5, 1) = \Omega(g)$.

Note every vertex can be reached from $S^0$ by passing through nodes in U, each of which is offered the
product for free. These nodes are guaranteed to accept the product, and therefore, they will collectively pass
on at least one recommendation to each vertex.

Now consider the expected revenue from a vertex $v \in V$. Let M be the random variable giving the
fraction of v's neighbors in V that were not offered the product for free. We know $E[M] = (1 - f)/2$, so by
Markov's inequality, with probability at least 1/2, we have $M < 1 - f$.

In this case, v is guaranteed to receive recommendations from at least a fraction f of its neighbors in V, as well
as all of its neighbors in $U \cup S^0$ (of which there is at least 1). If we charge v a total of c for the product, it will
then purchase the product with probability at least q, by the original definitions of f, c and q. Furthermore,
independent of v's neighbors, we will ask this price from v with probability $(1 - f)/2$. Therefore, our expected
revenue from v is at least $\frac{1}{2} \cdot q \cdot \frac{1 - f}{2} \cdot c$.

The result now follows from linearity of expectation. □ 

Now that we have computed the expected revenue from StrategyMaxLeaf, we need to characterize the 
optimal revenue to bound the approximation ratio. This bound is given by the following lemma. 

Lemma 6. The maximum expected revenue achievable by any strategy (adaptive or not) on a social network
G with g good vertices is $O(L \cdot g)$.

Proof. Let A denote the set of vertices in G with degree at least 3, and let $n_3 = |A|$. Clearly, no strategy
can achieve more than $n_3$ revenue directly from the nodes in A.

As observed in the proof of Lemma 4, however, G − A can be decomposed into a collection of primitive
paths. Since each primitive path contains at least one unique good vertex with degree less than 3, there are
at most $g - n_3$ such paths. Even if each endpoint of a path is guaranteed to recommend the product, the
total revenue from the path is at most L.

Therefore, the total revenue from any strategy on such a graph is at most $n_3 + (g - n_3)L = O(L \cdot g)$. □

Now, we can combine the above lemmas to state the main theorem of the paper, which states that 
StrategyMaxLeaf provides a constant factor approximation guarantee for the revenue. 

Theorem 1. Let K denote the complexity of our buyer decision model M. Then, the expected revenue generated
by StrategyMaxLeaf on an arbitrary social network is O(K)-competitive with the expected revenue
generated by the optimal (adaptive or not) strategy.

Proof. This follows immediately from Lemmas 5 and 6, as well as the fact that $K = \frac{L}{(1 - f)cq}$. □

As a corollary, we get the fact that the adaptivity gap is also constant: 
Corollary 1. Let K denote the complexity of our buyer decision model M. Then the adaptivity gap is O(K). 

Now we briefly address the issue of cashbacks, which was ignored in this exposition. We set the cashback
r to be a small fraction of our expected revenue $r_0$ from each individual, i.e. $r = z \cdot r_0$, where z < 1. Then,
our total profit will be $n \cdot r_0 \cdot (1 - z)$. Adding this cashback decreases our total profit by a constant factor
that depends on z, but otherwise the argument carries through as before, and nodes now have a positive
incentive to pass on recommendations.

In light of Corollary 1, one might ask whether the adaptivity gap is not just 1. In other words, is there
any benefit at all to be gained from using adaptive strategies? In fact, there is. For example, consider
a social network consisting of 4 nodes {v1, v2, v3, v4} in a cycle, with v3 connected to two other, otherwise isolated,
vertices. Suppose furthermore that a node will accept a recommendation with probability 0.5 unless the
price is 0, in which case the node will accept it with probability 1. On this network, with seed set S^0 = {v1},
the optimal adaptive strategy is to always demand full price unless exactly one of v2 and v4 purchases the
product initially, in which case v3 should be offered the product for free. This beats the optimal non-adaptive
strategy by a factor of 1.0625.
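
The factor 1.0625 can be checked by exact enumeration. The sketch below is illustrative only and rests on assumptions not stated in the text: the process runs in rounds, each recommendation is accepted independently (so a node hearing from h new buyers at full price buys with probability 1 − 0.5^h), full price is normalized to 1, and the two extra vertices are named `a` and `b`. The function `expected_revenue` and the policy signature are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# Cycle v1-v2-v3-v4 plus two extra vertices a, b attached to v3.
EDGES = [("v1", "v2"), ("v2", "v3"), ("v3", "v4"), ("v4", "v1"),
         ("v3", "a"), ("v3", "b")]
NBRS = {}
for u, w in EDGES:
    NBRS.setdefault(u, set()).add(w)
    NBRS.setdefault(w, set()).add(u)
HALF = Fraction(1, 2)


def expected_revenue(policy, active=frozenset({"v1"}), frontier=frozenset({"v1"})):
    """Exact expected revenue; policy(node, active) returns the price 0 or 1."""
    offers = []
    for u in sorted(NBRS):
        hits = len(NBRS[u] & frontier)  # new recommendations this round
        if u in active or hits == 0:
            continue
        price = policy(u, active)
        p_buy = Fraction(1) if price == 0 else 1 - HALF ** hits
        offers.append((u, price, p_buy))
    total = Fraction(0)
    # Enumerate every accept/reject outcome of this round's offers.
    for outcome in product([True, False], repeat=len(offers)):
        prob, gain, bought = Fraction(1), 0, set()
        for (u, price, p_buy), buys in zip(offers, outcome):
            prob *= p_buy if buys else 1 - p_buy
            if buys:
                gain += price
                bought.add(u)
        if prob > 0 and bought:
            later = expected_revenue(policy, active | bought, frozenset(bought))
            total += prob * (gain + later)
    return total


def fixed(prices):
    """Non-adaptive policy: each node's price is decided in advance."""
    return lambda node, active: prices[node]


def adaptive(node, active):
    """Full price, except v3 is free when exactly one of v2, v4 has bought."""
    if node == "v3" and len(active & {"v2", "v4"}) == 1:
        return 0
    return 1


others = ["v2", "v3", "v4", "a", "b"]
best_fixed = max(expected_revenue(fixed(dict(zip(others, ps))))
                 for ps in product([0, 1], repeat=len(others)))
best_adaptive = expected_revenue(adaptive)
print(best_fixed, best_adaptive, float(best_adaptive / best_fixed))
```

Under these assumptions the best fixed price assignment earns 2 in expectation, while the adaptive rule described above earns 17/8, a ratio of 17/16 = 1.0625.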

4 Local Search 

In this section, we discuss how an arbitrary seller strategy can be tweaked by the use of a local search 
algorithm. Taken on its own, this technique can sometimes be problematic since it can take a long time to 
converge to a good strategy. However, it performs very well when applied to an already good strategy, such 
as StrategyMaxLeaf. This approach of combining theoretically sound results with local search to generate 
strong techniques in practice is similar in spirit to the recent k-means++ algorithm fil. 
Intuitively, the local search strategy for pricing on social networks works as follows: 

• Choose an arbitrary seller strategy S and an arbitrary node v to edit. 

• Choose a set of prices $\{p_1, p_2, \ldots, p_k\}$ to consider.

• For each price $p_i$, empirically estimate the expected revenue $r_i$ that is achieved by using the price $p_i$
for node v.

• If any revenue $r_i$ beats the current expected revenue (also estimated empirically) by some threshold $\epsilon$,
then change S to use the price $p_i$ for node v.

• Repeat the preceding steps for different nodes until there are no more improvements. 
Henceforth, we call this the LocalSearch algorithm for improving seller strategies.
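
The steps above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the experiments: a strategy is represented as a dictionary from node to price, and `estimate` is a hypothetical callback standing in for the empirical revenue estimate. When several candidate prices clear the threshold, this sketch keeps the best one, which is one possible reading of the rule above.

```python
def local_search(prices, candidates, estimate, eps=1e-9, max_rounds=100):
    """Greedy per-node price improvement.

    prices     -- dict node -> current price (the strategy S)
    candidates -- iterable of prices p_1..p_k to try at each node
    estimate   -- hypothetical callback: dict node -> price  ->  revenue
    eps        -- minimum improvement required to accept a change
    """
    prices = dict(prices)          # work on a copy
    current = estimate(prices)
    for _ in range(max_rounds):
        improved = False
        for v in prices:
            best_p, best_r = prices[v], current
            for p in candidates:
                r = estimate({**prices, v: p})
                if r > best_r + eps:   # must beat the incumbent by eps
                    best_p, best_r = p, r
            if best_p != prices[v]:
                prices[v], current = best_p, best_r
                improved = True
        if not improved:           # no node can be improved any further
            break
    return prices, current


# Toy stand-in for the simulator: each node contributes p * (1 - p).
def toy_revenue(prices):
    return sum(p * (1 - p) for p in prices.values())


result, revenue = local_search({"a": 1, "b": 1, "c": 1}, [0, 0.5, 1], toy_revenue)
print(result, revenue)
```

In practice `estimate` would be replaced by averaged simulations of the cascade, and the threshold `eps` would be set large enough to absorb the noise of that estimate.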

To empirically estimate the revenue from a seller strategy, we can always just simulate the entire process. 
We know who has the product initially, we know what price each node will be offered, and we know the 
probability each node will purchase the product at that price after any number of recommendations. Simulating
this process a number of times and taking the average revenue, we can arrive at a fair approximation
of how good a strategy is in practice. In fact, we can prove that performing local search on any input policy
will ensure that the seller gets at least as much revenue as the original policy with high probability. The 
proof of this fact holds for any simulatable input policy, and proceeds by induction on the evolution tree of 
the process. The proof is somewhat technical, so we will skip it, and instead focus on the empirical question 
of the advantage provided by local search. 

In light of the fact that local search can only improve the revenue (and never hurt it), it seems that one 
should always implement local search for any policy. There is an important technical detail that complicates
this, however. Suppose we wish to evaluate strategies S1 and S2, differing only on one node v. If we
independently run simulations for each strategy, it could take thousands of trials (or more!) before the
systematic change to one node becomes visible over the noise resulting from random choices made by the
other nodes. It is impractical to perform this many simulations on a large network every time we want to
change the strategy for a single node. 

Fortunately, it is possible to circumvent this problem using an observation first noted in py. Let us 
consider the Linear Threshold model LTM#. In this case, all randomness occurs before the process begins 
when each node chooses a threshold that encodes how resistant it is to buying the product. Once these 
thresholds have been fixed, the entire sales process is deterministic. We can now change the strategy slightly 
and maintain the same thresholds to isolate exactly what effect this strategy change had. Any model, 
including Independent Cascade, can be rephrased in terms of thresholds, making this technique possible. 

The LocalSearch algorithm relies heavily on this observation. While comparing strategies, we choose 
several threshold lists, and simulate each strategy against the same threshold lists. If these lists are not 
representative, we might still make a mistake drawing conclusions from this, but we will not lose a universally 
good signal or a universally bad signal under the weight of random noise. 
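
A minimal sketch of this shared-threshold comparison is given below. The purchase rule inside `simulate` is a hypothetical stand-in chosen for illustration (free offers are always accepted, otherwise the purchase probability is the fraction of active neighbors times one minus the price); it is not the model from the paper. The point is only that both strategies are run against the same threshold lists.

```python
import random


def simulate(nbrs, seed, prices, thresholds):
    """Deterministic cascade once every node's threshold is fixed."""
    active, revenue = {seed}, 0.0
    changed = True
    while changed:
        changed = False
        for v in nbrs:
            if v in active:
                continue
            hits = len(nbrs[v] & active)
            if hits == 0:
                continue                      # no recommendation yet
            frac = hits / len(nbrs[v])
            # Hypothetical purchase rule, for illustration only.
            buy_prob = 1.0 if prices[v] == 0 else frac * (1 - prices[v])
            if buy_prob >= thresholds[v]:
                active.add(v)
                revenue += prices[v]
                changed = True
    return revenue


def paired_gain(nbrs, seed, prices_a, prices_b, trials, rng):
    """Average revenue of B minus A, using the same thresholds for both."""
    total = 0.0
    for _ in range(trials):
        thresholds = {v: rng.random() for v in nbrs}
        total += (simulate(nbrs, seed, prices_b, thresholds)
                  - simulate(nbrs, seed, prices_a, thresholds))
    return total / trials


# Path 0 - 1 - 2 with node 0 as the seed.
path = {0: {1}, 1: {0, 2}, 2: {1}}
full = {0: 0, 1: 0.5, 2: 0.5}
free_middle = {0: 0, 1: 0, 2: 0.5}
print(paired_gain(path, 0, full, free_middle, 1000, random.Random(0)))
```

Because each trial reuses one threshold list for both strategies, two identical strategies always produce a difference of exactly zero, and any nonzero difference is caused by the price change itself rather than by independent sampling noise.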

With this implementation, empirical tests (see the next section) show the LocalSearch algorithm does 
do its job: given enough time, it will improve virtually any strategy enough to be competitive. It is not a 
perfect solution, however. First of all, it can still make small mistakes while doing the random estimates, 
possibly causing a strategy to become worse over time.³ Secondly, it is possible to end up with a sub-optimal
strategy that simply cannot be improved by any local changes. Finally, the LocalSearch algorithm can 
often take many steps to improve a bad strategy, making it occasionally too slow to be useful in practice. 

Nonetheless, these drawbacks really only become a serious problem if one begins with a bad strategy. If
one begins with a relatively good strategy - for example StrategyMaxLeaf - the LocalSearch algorithm 
performs well, and is almost always worth doing in practice. We justify this claim in the next section. 



4.1 Experimental Results 

In this section, we provide experimental evidence for the efficacy of the LocalSearch algorithm in improving
the revenue guarantee. Note that in these experiments, we need to assume a benchmark strategy, as finding
the optimal strategy is NP-hard (see Section 8.1). We pick a very simple strategy, RandomPricing, which
picks a random price independently for each node. The results demonstrate that even this naive strategy
can be coupled with the LocalSearch algorithm to do well in practice.

3 Note that if we choose $\epsilon$ and the number of trials carefully, we can make this possibility vanishingly small (this is also the
intuition behind the local search guarantee mentioned earlier). In practice, however, it is usually better to run fewer
trials and accept the possibility of regressing slightly.

Figure 1: The variation in revenue generated by RandomPricing and StrategyMaxLeaf with the iterations
of the LocalSearch algorithm. The data is averaged over 10 runs on a 1000-node random preferential
attachment graph (a) or a 10000-node subgraph of YouTube (b), starting with a random seed each time.

We simulate the cascading process on two kinds of graphs. The first graph we study is a randomly
generated graph, based on the preferential attachment model that is a popular model for representing
social networks [12]. We generate a 1000-node preferential attachment graph at random, and simulate the
cascading process by picking a random node as the seed in the network. The probability model we examine
is a step function (see the second example given in Lemma 1) of probabilities. We note that the function is
necessarily arbitrary. The results for one particular parameter setting are shown in Figure 1(a), which plots the
average revenue obtained by the two pricing strategies: RandomPricing and StrategyMaxLeaf. Each
point on the figure is obtained by averaging the revenue over 10 runs on the same graph but with a different
(random) seed. The horizontal axis indicates the number of LocalSearch iterations that were done on the
graph, where each iteration consisted of simulating the process 50 times, and choosing the best value over
the runs. It is clear from the graph that StrategyMaxLeaf does quite well even without the addition of
LocalSearch, although the addition of LocalSearch does increase the revenue. On the other hand, the
RandomPricing strategy performs poorly on its own, but its revenue increases steadily with the iterations
of the LocalSearch algorithm. We note that the difference between the revenue from the two policies does
vary (as expected) with the probability model, and the difference is not equally large in all
the different runs. But the difference does persist across the runs, especially when the strategies are run
without the local search improvement.
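
For readers who want to reproduce the synthetic setup, a preferential attachment graph can be generated with the standard library alone. The sketch below is one common variant and is not taken from the paper: the number of links per new node, `m`, and the initial clique are assumptions, since the text does not specify them.

```python
import random


def preferential_attachment(n, m, rng):
    """Grow a graph where new nodes attach to m existing nodes,
    chosen with probability proportional to current degree.
    Requires 1 <= m < n. Returns dict node -> set of neighbors."""
    nbrs = {v: set() for v in range(n)}
    endpoints = []                      # each node appears once per incident edge
    for u in range(m + 1):              # start from a clique on m + 1 nodes
        for w in range(u + 1, m + 1):
            nbrs[u].add(w)
            nbrs[w].add(u)
            endpoints += [u, w]
    for v in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:          # sampling endpoints is degree-biased
            chosen.add(rng.choice(endpoints))
        for u in chosen:
            nbrs[v].add(u)
            nbrs[u].add(v)
            endpoints += [v, u]
    return nbrs


rng = random.Random(0)
graph = preferential_attachment(1000, 2, rng)
seed_node = rng.randrange(1000)         # random seed node, as in the experiments
print(len(graph), max(len(s) for s in graph.values()))
```

Sampling from the list of edge endpoints is what makes high-degree nodes proportionally more likely to be chosen, which produces the heavy-tailed degree distribution typical of this model.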

We also conduct a similar simulation with a real-world network, namely the links between users of the
video-sharing site YouTube.⁴ The YouTube network has millions of nodes, and we only study a subset of
10,000 nodes of the network. We simulate the random process as earlier, and the results are shown in
Figure 1(b). Again, we note that StrategyMaxLeaf does very well on its own, easily beating the revenue
of RandomPricing. The RandomPricing strategy does improve a lot with LocalSearch, but it fails to
match the revenue of StrategyMaxLeaf. The large size of the YouTube graph and the expensive nature
of the LocalSearch algorithm restrict the size of the experiments we can conduct with the graph, but
the results from the above experiments do offer some insights. In particular, StrategyMaxLeaf
succeeds in extracting a good portion of the revenue from the graph, if we consider the revenue obtained 
from StrategyMaxLeaf combined with LocalSearch based improvements to be the benchmark. Further, 
LocalSearch can improve the revenue from any strategy by a substantial margin, though it may not be able 
to attain enough revenue when starting with a sub-optimal strategy such as RandomPricing. Finally, we 
observe that the combination of StrategyMaxLeaf and LocalSearch generates the best revenue among 
our strategies, and it is an open question as to whether this is the optimal adaptive strategy. 



5 Conclusions 

In this work, we discussed pricing strategies for sellers distributing a product over social networks through 
viral marketing. We show that computing the optimal (one that maximizes expected revenue) non-adaptive 
strategy for a seller is NP-Hard. In a positive result, we show that there exists a non-adaptive strategy 
for the seller which generates expected revenue that is within a constant factor of the expected revenue 
generated by the optimal adaptive strategy. This strategy is based on an influence-and-exploit policy which
computes a max-leaf spanning tree of the graph, and offers the product to the interior nodes of the spanning 
tree for free, later on exploiting this influence by extracting its profit from the leaf nodes of the tree. The 
approximation guarantee of the strategy holds for fairly general conditions on the probability function. 



6 Open Questions 

The added dimension of pricing to influence maximization models poses a host of interesting questions, many 
of which are open. An obvious direction in which this work could be extended is to think about influence 
models stronger than the model examined here. It is also unclear whether the assumptions on the function
C(·) are the minimal set that is required, and it would be interesting to remove the assumption that there
exists a price at which the probability of acceptance is 1. A different direction of research would be to
consider the game-theoretic issues involved in a practical system. Namely, in the model presented here, we
think of each buyer as just sending the recommendations to all its friends and ignore the issue of any "cost"
involved in doing so, thereby assuming all the nodes to be non-strategic. It would be very interesting to
model a system where the nodes were allowed to behave strategically, trying to maximize their payoff, and
characterize the optimal seller strategy (especially w.r.t. the cashback) in such a setting.

4 The network can be freely downloaded; see ll] for details.
8 Appendix

8.1 Hardness of finding the optimal strategy 

In this section, we show that Problem 1 is NP-hard even for a very simple buyer model M by a reduction
from vertex cover with bounded degree (see [3] for the hardness of bounded-degree vertex cover). Letting d
denote the degree bound, and letting $p = \frac{1}{4d}$, we will use an Independent Cascade Model $ICM_C$ with:

$$C(x) = \begin{cases} 1 & \text{if } x \le p \\ 0 & \text{if } x > p \end{cases}$$

Intuitively, the seller has to partition the nodes into "free" nodes and "full-price" nodes. In the former 
case, nodes are offered the product for free, and they accept it with probability 1 as soon as they receive 
a recommendation. In the latter case, nodes are offered the product for price 1, and they accept each 
recommendation with probability p. (Note that the seller is allowed to use other prices between 0 and 1 but
a price of 1 is always better.) 

We are going to use a special family of graphs illustrated in Figure 2. The graph consists of four layers:

• A singleton node s, which we will use as the only initially active node (i.e., S^0 = {s});

• s links to a set of n nodes, denoted by V1;

• Nodes in V1 also link to another set of nodes, denoted by V2. Each node in V1 will be adjacent to d
nodes in V2, and each node in V2 will be adjacent to 2 nodes in V1 (so |V2| = dn/2);

• Each node v ∈ V2 also links to k = 20d new nodes, denoted by W_v; these k nodes do not link to any
other nodes. The union of all W_v's is denoted by V3.

We first sketch the idea of the hardness proof. The connection between V1 and V2 will be decided by the
vertex cover instance: given a vertex cover instance G'(V,E) with bounded degree d, we construct a graph
G as above where V1 = V and V2 = E, adding an edge between V1 and V2 if the corresponding vertex is
incident to the corresponding edge in G'. The key lemma is that, in the optimal pricing strategy for G, the
subset of nodes in V1 that are given the product for free is the minimum set that covers V2 (i.e., a minimum
vertex cover of G').
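
The construction is mechanical, and a short sketch may help make the four layers concrete. The function name `build_pricing_instance` and the tuple labels for nodes are hypothetical choices made for this illustration; only the layer structure and k = 20d come from the text.

```python
def build_pricing_instance(vertices, edges, d):
    """Build the four-layer graph from a vertex cover instance (V, E)
    of maximum degree d. Returns dict node -> set of neighbors."""
    k = 20 * d                          # size of each private leaf set W_v
    nbrs = {"s": set()}

    def link(a, b):
        nbrs.setdefault(a, set()).add(b)
        nbrs.setdefault(b, set()).add(a)

    for v in vertices:                  # layer V1: one node per vertex of G'
        link("s", ("V1", v))
    for (u, w) in edges:                # layer V2: one node per edge of G'
        e = ("V2", u, w)
        link(("V1", u), e)
        link(("V1", w), e)
        for i in range(k):              # layer V3: k leaves attached to e only
            link(e, ("V3", u, w, i))
    return nbrs


# A triangle has maximum degree d = 2, so k = 40.
triangle = build_pricing_instance([0, 1, 2], [(0, 1), (1, 2), (0, 2)], d=2)
print(len(triangle))
```

For the triangle this gives 1 + 3 + 3 + 3 · 40 = 127 nodes, with every V1 node adjacent to s and to d = 2 nodes of V2.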

To formalize this, first note that, in an optimal strategy, all nodes in V3 should be full-price. Giving the 
product to them for free gets 0 immediate revenue, and offers no long-term benefit since nodes in V3 cannot
recommend the product to anyone else. If the nodes are full-price, on the other hand, there is at least a 
chance at some revenue. 

On the other hand, we show the optimal strategy must also ensure each vertex in V2 eventually becomes 
active with probability 1. 
Figure 2: Reducing Bounded-Degree Vertex Cover to Optimal Network Pricing



Lemma 7. In an optimal strategy, every node v ∈ V2 is free, and can be reached from s by passing through
free nodes.

Proof. Suppose, by way of contradiction, that the optimal strategy has a node v ∈ V2 that does not satisfy
these conditions. Let u1 and u2 be the two neighbors of v in V1, and let q denote the probability that v
eventually becomes active.

We first claim that $q \le 2dp$. Indeed, if v is full-price, then even if u1 and u2 become active, the probability
that v becomes active is $1 - (1 - p)^2 \le 2p$. Otherwise, u1 and u2 are both full-price. Since u1 and u2 connect
to at most 2d edges other than those to v, the probability that one of them becomes active before v is at most
$1 - (1 - p)^{2d} \le 2dp$. Thus, $q \le 2dp$.

It follows that the total revenue that this strategy can achieve from u1, v, and W_v is at most
$2 + kqp \le 2 + 2kdp^2 = 4.5$. Conversely, if we make u1 and v free, we can achieve kp = 5 revenue from the same buyers. Furthermore,
doing this cannot possibly lose revenue elsewhere, which contradicts the assumption that our original strategy
was optimal. □
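
The two constants in this argument can be verified directly. The check below assumes $p = 1/(4d)$, the value implied by the identities k = 20d and kp = 5 used in the proof:

```latex
\begin{align*}
2 + 2kdp^2 &= 2 + 2 \cdot 20d \cdot d \cdot \frac{1}{16d^2} = 2 + \frac{40}{16} = 4.5, \\
kp &= 20d \cdot \frac{1}{4d} = 5 .
\end{align*}
```

Both values are independent of d, so the gap between 4.5 and 5 holds for every degree bound.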

It follows that, in an optimal strategy, all of V3 is full-price, all of V2 is free, and every node in V2 is
adjacent to a free node in V1. It remains only to determine C, the set of nodes in V1 that an optimal strategy
should make free. At this point, it should be intuitively clear that C should correspond to a minimum
vertex cover of G'. We now formalize this as follows:

Lemma 8. Let C denote the set of free nodes in V1, as chosen by an optimal strategy. Then C corresponds
to a minimum vertex cover of G'.

Proof. As noted above, every node in V2 must be adjacent to a node in C, which implies C does indeed
correspond to a vertex cover in G'.

Now we know an optimal strategy makes every node in V2 free, and every node in V3 full-price. Once
we know C, the strategy is determined completely. Let $x_C$ denote the expected revenue obtained by this
strategy. Since all nodes in V2 are free and are activated with probability 1, we know the strategy achieves
0 revenue from V2 and $p|V_3|$ expected revenue from V3.

Among nodes in V1, the strategy achieves 0 revenue for free nodes, and exactly $1 - (1 - p)^{d+1}$ expected
revenue for each full-price node. This is because each full-price node is adjacent to exactly d + 1 other nodes,
and each of these nodes is activated with probability 1. Therefore, $x_C = (|V_1| - |C|)(1 - (1 - p)^{d+1}) + p|V_3|$,
which is clearly maximized when C is a minimum vertex cover. □

Therefore, optimal pricing, even in this limited scenario, can be used to calculate the minimum vertex
cover of any bounded-degree graph, from which NP-hardness follows.

Theorem 2. Two Coupon Optimal Strategy Problem is NP-Hard. 